For specified functions $\phi$ and $\psi$ and unknown distribution function $F$
with density $f$, the efficacy-related parameter
$T(f) = \int \phi(x)\psi(F(x))f^2(x)\,dx$ may be estimated by the sample
analogue estimator $T(f_n)$ based on an empirical density estimator $f_n$.
For $\{X_i\}$ i.i.d. $F$ and $f_n$ of the form
$f_n(x) = n^{-1}\sum_{i=1}^n f_{ni}(x)$, we approximate the estimation error
$T(f_n) - T(f)$ by the Gâteaux derivative of the functional $T(\cdot)$ at the
"point" $f$ with increment $f_n - f$. In conjunction with stochastic
properties of the $L_2$-norm $\|f_n - f\|$, this approach leads to
characterizations of the stochastic behavior of $T(f_n) - T(f)$. In
particular, under mild assumptions on $f$, we obtain the rate of strong
convergence $T(f_n) - T(f) = O(n^{-1/2}(\log n)^{1/2})$ a.s., which
significantly improves previous results in the literature. Also, we
establish asymptotic normality with associated Berry-Esséen rates.


Key Phrases: Nonparametric estimation; efficacy; functionals of probability
density; strong convergence; asymptotic distribution.


1.  Introduction. In nonparametric inference two statistical procedures
are often compared by their asymptotic relative efficiency (ARE), which
depends on efficacy parameters defined in terms of the underlying
probability distribution of the data. An important such efficacy-related
functional is

(1.1)  $T(f) = \int \phi(x)\psi(F(x))f^2(x)\,dx$,

where $f$ is the underlying probability density function, $F$ is the
corresponding cumulative distribution function (cdf), and $\phi$ and $\psi$
are specified functions. For example, for the case $\phi(x) = \psi(x) = 1$,
this functional reduces to $\int f^2(x)\,dx$, which appears as a factor in
the Pitman ARE of various test comparisons involving as one of the tests the
Wilcoxon rank sum test, or the Wilcoxon signed rank test, or the
Kruskal-Wallis test. Other important special cases of (1.1) are
$\int x[2F(x) - 1]f^2(x)\,dx$,
$\int x\{I(-\infty, 0] - I(0, \infty)\}f^2(x)\,dx$, and
$\int (d/dx)\Phi^{-1}(F(x))f^2(x)\,dx$, where $\Phi$ denotes the standard
normal cdf. Discussion of these and other examples may be found in Puri and
Sen (1971).
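To make the functional (1.1) concrete, the following sketch evaluates it numerically when the density is known. The standard normal density, the grid limits, and the helper names (`trapezoid`, `efficacy_functional`) are illustrative assumptions, not part of the paper.

```python
import numpy as np
from math import erf, pi, sqrt

def trapezoid(y, x):
    """Composite trapezoidal rule on the grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def efficacy_functional(phi, psi, pdf, cdf, lo=-10.0, hi=10.0, m=20001):
    """Approximate T(f) = int phi(x) psi(F(x)) f(x)^2 dx on [lo, hi]."""
    x = np.linspace(lo, hi, m)
    return trapezoid(phi(x) * psi(cdf(x)) * pdf(x) ** 2, x)

def normal_pdf(x):
    return np.exp(-0.5 * x ** 2) / sqrt(2.0 * pi)

def normal_cdf(x):
    return 0.5 * (1.0 + np.vectorize(erf)(x / sqrt(2.0)))

one = lambda u: np.ones_like(u)

# phi = psi = 1: T(f) = int f^2, which equals 1 / (2 sqrt(pi)) for N(0, 1).
t_wilcoxon = efficacy_functional(one, one, normal_pdf, normal_cdf)

# phi(x) = x, psi(u) = 2u - 1: the special case int x [2F(x) - 1] f^2(x) dx.
t_scale = efficacy_functional(lambda x: x, lambda u: 2.0 * u - 1.0,
                              normal_pdf, normal_cdf)
```

In practice $f$ is unknown, which is why the paper replaces it by an empirical density; this sketch only shows what the target quantity is.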

Usually little is assumed known about the underlying probability
density $f$, but some enlightenment may be gained by finding the lower
bound of the ARE over a specified class of densities. It also becomes
of interest to estimate the ARE from the data. In this connection, we
explore in this paper the stochastic behavior of certain estimators of
the functional $T(f)$ defined by (1.1).

For the special case $\int f^2(x)\,dx$, a consistent estimate was produced
by Lehmann (1963) as a byproduct of an investigation using the signed
rank test to construct a confidence interval for the location shift
parameter. More generally, Sen (1966) proposed estimators for $T(f)$ in
the case that $\phi(x) \equiv 1$ or $\phi(x) = x$ and $\psi(x) = J'(x)$,
where $J$ is a score function defining a rank test, and established weak
consistency and asymptotic normality of these estimators (under regularity
conditions on $f$ and $J$).

Bhattacharyya and Roussas (1969) suggested estimation of $\int f^2(x)\,dx$
by $\int f_n^2(x)\,dx$, where $f_n$ is a kernel-type empirical probability
density function for estimation of $f$ based on a sample of size $n$ from
$f$, and established convergence of this estimator in the first and second
means. Schuster (1974) investigated strong convergence and established the
rate $O(n^{-1/3}\log n)$. He also introduced the alternative estimator
$\int f_n(x)\,dF_n(x)$, where $F_n$ is the usual empirical cdf, and showed
that the two estimators have the same asymptotic almost sure behavior.
Ahmad (1976) established asymptotic normality for the latter estimator.
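The two estimators of $\int f^2(x)\,dx$ mentioned above can be compared directly. The sketch below assumes a Gaussian kernel and an arbitrary illustrative bandwidth `c` (neither is prescribed by the cited authors); with that kernel both estimators reduce to double sums over pairwise differences.

```python
import numpy as np

def integrated_square(sample, c):
    """int f_n(x)^2 dx for a Gaussian-kernel f_n with bandwidth c (closed form)."""
    d = sample[:, None] - sample[None, :]
    # The convolution of two N(0, c^2) densities is the N(0, 2 c^2) density.
    return float(np.mean(np.exp(-d ** 2 / (4.0 * c ** 2))
                         / (2.0 * c * np.sqrt(np.pi))))

def plug_in_at_sample(sample, c):
    """int f_n dF_n = n^{-1} sum_i f_n(X_i), with F_n the usual empirical cdf."""
    d = (sample[:, None] - sample[None, :]) / c
    return float(np.mean(np.exp(-0.5 * d ** 2) / (c * np.sqrt(2.0 * np.pi))))

rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
c = 2000 ** (-1.0 / 3.0)                    # illustrative bandwidth (assumption)
est_a = integrated_square(x, c)             # int f_n^2
est_b = plug_in_at_sample(x, c)             # int f_n dF_n
target = 1.0 / (2.0 * np.sqrt(np.pi))       # int f^2 for the N(0, 1) density
```

Both double sums include the diagonal terms $i = j$; whether to drop them is a separate design choice not addressed here.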

Estimation of the general functional (1.1) has been considered by
Ahmad and Lin (1976) and Winter (1978). Winter employs the estimator
$T(f_n)$ for $T(f)$, with $f_n$ as above, and establishes strong convergence
with rate $O(n^{-1/3}(\log n)\delta_n)$, where $\delta_n \to \infty$, for the
case that $\phi$ is bounded and $\psi$ has a bounded derivative.

In the present treatment, we also consider estimation of $T(f)$ by
the sample analogue estimator $T(f_n)$, but we allow greater flexibility
in the choice of $f_n$ and we employ a different technique for analysis of
$T(f_n)$. Specifically, we approximate $T(f_n) - T(f)$ by the Gâteaux
derivative of the functional $T(\cdot)$ at the point $f$ with increment
$f_n - f$. By this method we are able to establish significantly improved
rates of strong convergence, namely $O(n^{-1/2}(\log n)^{1/2})$ and under
some conditions $O(n^{-1/2}(\log\log n)^{1/2})$, the latter probably
optimal. The method also yields asymptotic normality along with associated
Berry-Esséen rates. Furthermore, we are able to relax the restrictions on
$\phi$ and $\psi$ imposed by previous authors.

The basic notation, assumptions and method are presented in Section
2. The special case $\phi(\cdot) = \psi(\cdot) = 1$ and $f$ square
integrable is treated in Section 3. In Section 4 direct extensions to the
following cases are discussed: (a) $f$ has bounded support, $\phi$ is
continuous, $\psi$ has bounded second derivative; (b) $f$ is square
integrable, $\phi$ is bounded, $\psi$ has bounded second derivative.
Section 5 treats the general case, dropping all major restrictions on $f$
and $\phi$, but at the expense of making the estimator somewhat more
complicated. In Section 6 we consider two specific examples of estimators
of the simple density functional $\int f^2(x)\,dx$ and point out certain
computational approaches.

2.  The basic approach. Let $\{X_i\}$ be independent random variables
having density function $f$. Let $f_n$ be an empirical density function
based on $X_1, \ldots, X_n$, and let $F_n$ denote the associated cdf
obtained by integration of $f_n$.

We consider estimation of the functional $T(f)$ defined by (1.1) by
$T(f_n)$. Following von Mises (1947), let us approximate the estimation
error $T(f_n) - T(f)$ by an appropriate Gâteaux derivative. For an
arbitrary functional $T(\cdot)$, the Gâteaux derivative of $T(\cdot)$ at the
point $f$ with increment $g - f$, where $f$ and $g$ are "points" in the
space of density functions, is defined as

    $T(f; g - f) = \frac{d}{d\lambda} T((1 - \lambda)f + \lambda g)\Big|_{\lambda = 0}$.

For $g$ sufficiently close to $f$, $T(f; g - f)$ serves as an approximation
to $T(g) - T(f)$. In particular, for $T(\cdot)$ given by (1.1) and for
$g = f_n$, we find

(2.1)  $T(f; f_n - f) = 2\int \phi(x)\psi(F(x))f(x)[f_n(x) - f(x)]\,dx
        + \int \phi(x)\psi'(F(x))f^2(x)[F_n(x) - F(x)]\,dx$,

assuming that $\psi$ is differentiable.
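The formula (2.1) can be sanity-checked numerically by comparing it with a difference quotient in $\lambda$. The sketch below is an illustration under assumptions: the two normal densities, the choices $\phi(x) = x$ and $\psi(u) = 2u - 1$, the grid, and all helper names are hypothetical, and a fixed nearby density `g` stands in for $f_n$.

```python
import numpy as np

def trapezoid(y, x):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def cumulative(y, x):
    """Cumulative trapezoidal integral: the cdf belonging to density y."""
    steps = (y[1:] + y[:-1]) * np.diff(x) / 2.0
    return np.concatenate(([0.0], np.cumsum(steps)))

def T(h, x, phi, psi):
    """T(h) = int phi(x) psi(H(x)) h(x)^2 dx, with H the cdf of h."""
    return trapezoid(phi(x) * psi(cumulative(h, x)) * h ** 2, x)

def gateaux(f, g, x, phi, psi, dpsi):
    """Right-hand side of (2.1), with g in place of f_n."""
    F, G = cumulative(f, x), cumulative(g, x)
    first = 2.0 * trapezoid(phi(x) * psi(F) * f * (g - f), x)
    second = trapezoid(phi(x) * dpsi(F) * f ** 2 * (G - F), x)
    return first + second

def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

x = np.linspace(-12.0, 12.0, 24001)
f = normal_pdf(x, 0.0, 1.0)
g = normal_pdf(x, 0.3, 1.2)              # hypothetical nearby density
phi = lambda t: t                        # phi(x) = x
psi = lambda u: 2.0 * u - 1.0            # psi(u) = 2u - 1
dpsi = lambda u: 2.0 * np.ones_like(u)   # psi'(u) = 2

lam = 1e-5
difference_quotient = (T((1 - lam) * f + lam * g, x, phi, psi)
                       - T(f, x, phi, psi)) / lam
derivative = gateaux(f, g, x, phi, psi, dpsi)
```

The two numbers should agree up to an error of order `lam`, which is the sense in which $T(f; g - f)$ is the linear term of $T(g) - T(f)$.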

The usefulness of (2.1) will depend in part on properties of $f_n$.
We shall assume that $f_n$ has the form

(A1)  $f_n(x) = n^{-1}\sum_{i=1}^n f_{ni}(x)$,

where the $i$-th function $f_{ni}$ depends only on the $i$-th random variable
and on $n$. For example, this structure includes the kernel type $f_n$ in
which $f_{ni}$ is of the form $f_{ni}(x) = c_n^{-1}K(c_n^{-1}(x - X_i))$,
where $K$ is a specified "kernel" function and $\{c_n\}$ is a sequence of
constants tending to 0.

Sometimes we shall assume in addition that

(A2)  $f_{ni} = f_{ii}$, $1 \le i \le n$, $n = 1, 2, \ldots$,

which makes $\{f_n\}$ computable recursively:
$f_n = n^{-1}[(n - 1)f_{n-1} + f_{nn}]$. That is, the $n$-th stage function
depends on $X_1, \ldots, X_{n-1}$ only through the result of the
$(n - 1)$-th stage computation.
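The recursion under (A2) can be sketched as follows, with the density stored on a fixed grid. The Gaussian kernel, the bandwidth sequence `bandwidth(i)`, and the grid are hypothetical choices made only for illustration; the point is that each update uses the previous stage and the newest observation, nothing else.

```python
import numpy as np

def kernel(u):
    """Standard normal kernel (an illustrative choice of K)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def bandwidth(i):
    """Hypothetical bandwidth sequence c_i; under (A2) it depends on i only."""
    return i ** (-1.0 / 3.0)

def f_ii(grid, x_i, i):
    """i-th summand f_ii(x) = c_i^{-1} K((x - X_i) / c_i)."""
    c = bandwidth(i)
    return kernel((grid - x_i) / c) / c

def recursive_density(sample, grid):
    """Apply f_n = n^{-1} [(n - 1) f_{n-1} + f_nn], one observation at a time."""
    f = np.zeros_like(grid)
    for n, x_n in enumerate(sample, start=1):
        f = ((n - 1) * f + f_ii(grid, x_n, n)) / n
    return f

rng = np.random.default_rng(1)
sample = rng.standard_normal(500)
grid = np.linspace(-5.0, 5.0, 1001)

f_rec = recursive_density(sample, grid)
# Batch version of (A1) with f_ni = f_ii, for comparison.
f_batch = np.mean([f_ii(grid, x, i) for i, x in enumerate(sample, start=1)],
                  axis=0)
```

The recursive and batch results coincide up to floating-point rounding, but the recursive form needs only constant work per new observation.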

A key feature of $f_n$ due to (A1) is its structure as an average over
the independent elements of the $n$-th row of a double array of random
variables. By (2.1), we readily see that this feature applies as well
to the structure of $T(f; f_n - f)$:

(2.2)  $T(f; f_n - f) = n^{-1}\sum_{i=1}^n T(f; f_{ni} - f)$.

Thus we may handle $T(f; f_n - f)$ by routine application of classical
probability theory for sums. In the recursive case, that is, when (A2)
holds also, there is a further simplification: the problem reduces to
averaging over a single sequence of random variables.

The usefulness of (2.1) will depend also upon negligibility of the
approximation error $T(f_n) - T(f) - T(f; f_n - f)$. In order to show that
this quantity is $o_p(n^{-1/2})$, or almost surely (a.s.)
$O(n^{-1/2}(\log\log n)^{1/2})$, or the like, we shall use the following
"differential inequality." Let $\|h\|_p$ denote for $0 < p < \infty$ the
$L_p$-norm $(\int |h|^p)^{1/p}$ and for $p = \infty$ the sup-norm
$\sup_x |h(x)|$.

LEMMA 2.1. Let $T$ be given by (1.1). Assume that either (a) $f$ has
bounded support and $\phi$ is continuous, or (b) $f$ is square integrable
and $\phi$ is bounded. Assume that $\psi$ has bounded second derivative. Then

(2.3)  $|T(f_n) - T(f) - T(f; f_n - f)| \le c_1\|f_n - f\|_2^2 + c_2\|F_n - F\|_\infty^2$,

where $c_1$ and $c_2$ are constants depending on $f$, $\phi$ and $\psi$ but
not on $f_n$. Further, in case (a) we may take $c_2 = 0$.
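For orientation, the inequality can be verified by hand in the simplest special case $\phi = \psi = 1$, where the second term of (2.1) vanishes because $\psi' = 0$. The following worked computation is a sketch for that case only; $g$ denotes any square-integrable candidate such as $f_n$.

```latex
\begin{align*}
T(f; g - f) &= \frac{d}{d\lambda}\int \bigl(f + \lambda(g - f)\bigr)^2 dx \,\Big|_{\lambda = 0}
             = 2\int f\,(g - f)\,dx, \\
T(g) - T(f) - T(f; g - f)
            &= \int g^2\,dx - \int f^2\,dx - 2\int f g\,dx + 2\int f^2\,dx \\
            &= \int (g - f)^2\,dx = \|g - f\|_2^2 .
\end{align*}
```

So in this case the remainder is exactly the squared $L_2$ distance, which is why the behavior of $\|f_n - f\|_2$ governs the approximation error.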

The proof is routine and omitted. We will exploit the lemma by assuming that
$f$ and $\{f_n\}$ are such that the following conditions hold:


(B1)  $n^{1/2}\|f_n - f\|_2^2 \to 0$ a.s.,

(B2)  $n^{1/2}\|F_n - F\|_\infty^2 \to 0$ a.s.
Conditions for (B1) have been investigated by Cheng and Serfling (1979)
for kernel type $f_n$. Each of the following is a sufficient requirement
on $f$: (i) $f$ possesses a bounded continuous $L_2(-\infty, \infty)$
derivative; (ii) $f$ is Lipschitz on $(-\infty, \infty)$ and satisfies a
tail restriction of form $\int_{|x|>t} f(x)\,dx = O(t^{-q})$, $t \to \infty$,
for some $q > 0$; (iii) the characteristic function of $f$ decreases
algebraically of degree $p > 0$, in the sense of Parzen (1962) and Watson
and Leadbetter (1963). In each case a suitable choice of kernel $K$ and
constants $\{c_n\}$ can be made so as to achieve (B1).

Conditions for (B2) follow from work of Winter (1979), who establishes
for suitable $f_n$ the stronger property
$n^{1/2}\|F_n - F\|_\infty = O((\log\log n)^{1/2})$ a.s.,
under the assumption that $f$ possesses a bounded derivative.

It will also be of interest, in connection with Berry-Esséen rates,
to have $f$ and $f_n$ satisfy

(C)  $P(n^{1/2}\|f_n - f\|_2^2 > a_n) = O(a_n)$,

for a sequence of constants $a_n$ tending to 0. The work of Cheng and
Serfling noted above also provides (C) under conditions similar to (i),
(ii), (iii). However, the analogue of (C) for $\|F_n - F\|_\infty$ has not
been investigated at this point.

In dealing in Section 5 with the general case of $T(f)$ with $f$ and $\phi$
unrestricted, our estimator will be a truncated version of $T(f_n)$, namely

    $T_n(f_n) = \int_{-t_n}^{t_n} \phi(x)\psi(F_n(x))f_n^2(x)\,dx$,

where $t_n$ is a sequence of constants tending to $\infty$. The
corresponding Gâteaux derivative of $T_n(\cdot)$ at $f$ with increment
$f_n - f$ is a similar truncation of $T(f; f_n - f)$.

Both $T(f_n)$ and $T_n(f_n)$ are Borel measurable if $f_n$ is a kernel type
estimator. In the sequel we assume that $T(f_n)$, $T_n(f_n)$,
$T(f; f_n - f)$ and $T_n(f; f_n - f)$ are Borel measurable without further
mention.

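The following notation will be needed:

    $\mu_{ni} = E\{T(f; f_{ni} - f)\}$,   $\mu_n = n^{-1}\sum_{i=1}^n \mu_{ni}$,

    $\sigma_{ni}^2 = \mathrm{Var}\{T(f; f_{ni} - f)\}$,   $\sigma_n^2 = n^{-1}\sum_{i=1}^n \sigma_{ni}^2$,

    $\gamma_{ni}(\nu) = E|T(f; f_{ni} - f) - \mu_{ni}|^{\nu}$,   $\gamma_n(\nu) = n^{-1}\sum_{i=1}^n \gamma_{ni}(\nu)$.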

3.  The case $\phi = \psi = 1$. In this section the target functional is
simply $T(f) = \int f^2(x)\,dx$. Under the general assumptions (A), (B), and
(C) on $f$ and $f_n$, discussed in Section 2, we characterize the stochastic
behavior of $T(f_n)$. Theorem 1 provides the rate of a.s. convergence.
Theorem 2 provides asymptotic normality along with an associated
Berry-Esséen rate. The hypotheses of the theorems will also entail
restrictions directly imposed on the quantities $\mu_n$, $\sigma_n$,
$\gamma_n(\nu)$, etc. These conditions will be further discussed at the
conclusion of this section.

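THEOREM 1. Let $f$ and $f_n$ satisfy (A1) and (B1). Assume also that

(3.1)  $\int |f_{ni}(x)|\,dx \le C$, all $i$ and $n$,

and

(3.2)  $\mu_n = o(n^{-1/2}(\log n)^{1/2})$, $n \to \infty$.

Then

(3.3)  $|T(f_n) - T(f)| = O(n^{-1/2}(\log n)^{1/2})$ a.s., $n \to \infty$.

If, also, (A2) holds, $\mu_n = o(n^{-1/2}(\log\log n)^{1/2})$, and
$\sigma_n^2 \to \sigma^2$, $0 < \sigma^2 < \infty$, then

(3.4)  $\limsup_{n \to \infty} \dfrac{n[T(f_n) - T(f)]}{(2\sigma^2 n \log\log n)^{1/2}} = 1$ a.s.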


PROOF. By Lemma 2.1 and (B1), we have

(3.5)  $n^{1/2}|T(f_n) - T(f) - T(f; f_n - f)| \to 0$ a.s.

In view of (3.2), to complete the proof of (3.3) it suffices to show

(3.6)  $|T(f; f_n - f) - E\{T(f; f_n - f)\}| = O(n^{-1/2}(\log n)^{1/2})$ a.s.

By (2.1) and (2.2), represent

(3.7)  $T(f; f_n - f) = n^{-1}\sum_{i=1}^n 2\int f(x)[f_{ni}(x) - f(x)]\,dx$.

By (3.1), the summands in (3.7) are bounded random variables, say bounded
by $B$. Therefore, by Theorem 2 of Hoeffding (1963), we have

    $P(|T(f; f_n - f) - E\{T(f; f_n - f)\}| \ge t) \le 2\exp(-2nt^2/B^2)$,

from which (3.6) follows by the Borel-Cantelli lemma.
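The Borel-Cantelli step can be spelled out as follows. This is a sketch that takes the exponential bound exactly in the form displayed above; the constant $c$ and the sequence $t_n$ are introduced here only for illustration, and $D_n$ abbreviates $T(f; f_n - f) - E\{T(f; f_n - f)\}$.

```latex
\begin{align*}
t_n &= c\,(n^{-1}\log n)^{1/2}
  \quad\Longrightarrow\quad
  P\bigl(|D_n| \ge t_n\bigr) \le 2\exp\!\bigl(-2c^2 \log n / B^2\bigr) = 2\,n^{-2c^2/B^2}, \\
\sum_{n} 2\,n^{-2c^2/B^2} &< \infty
  \quad\text{whenever } 2c^2/B^2 > 1, \text{ i.e. } c > B/\sqrt{2}.
\end{align*}
```

For such $c$ the events $\{|D_n| \ge t_n\}$ occur only finitely often with probability one, so $|D_n| < c\,n^{-1/2}(\log n)^{1/2}$ for all large $n$ a.s., which is the rate asserted in (3.6).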

On the other hand, if $f_n$ satisfies (A2), then $T(f; f_n - f)$ may be
regarded as the partial sum of independent bounded random variables.
Thus (3.4) follows from (3.5), $\mu_n = o(n^{-1/2}(\log\log n)^{1/2})$, and
the law of the iterated logarithm of Kolmogorov (1929).  □

THEOREM 2. Let $f$ and $f_n$ satisfy (A1) and (B1). Assume also that

(3.8a)  $\mu_n = o(n^{-1/2})$,

(3.8b)  $\sigma_n^2 \to \sigma^2$, $0 < \sigma^2 < \infty$,

and, for some $\nu > 2$,

(3.8c)  $n\gamma_n(\nu)/(n\sigma_n^2)^{\nu/2} \to 0$.

Then

(3.9)  $n^{1/2}[T(f_n) - T(f)] \to_d N(0, \sigma^2)$.

If, also, (C) holds for a sequence $\{a_n\}$ such that $n^{-1/2} = O(a_n)$,
and $n\gamma_n(3)/(n\sigma_n^2)^{3/2} = O(n^{-1/2})$, then (for $\Phi$ the
standard normal cdf)

(3.10)  $\sup_t |P(n^{1/2}[T(f_n) - T(f)] \le t) - \Phi(t)| = O(a_n)$.


PROOF. We use the following well-known device. For any sequences
of random variables $\{\xi_n\}$ and $\{\eta_n\}$ and sequence of positive
constants $\{a_n\}$,

    $\sup_t |P(\xi_n \le t) - \Phi(t)| \le \sup_t |P(\eta_n \le t) - \Phi(t)| + O(a_n) + P(|\xi_n - \eta_n| \ge a_n)$.

By this inequality and an argument similar to that for Theorem 1, we
reduce the problem to an application of standard central limit theory for
double arrays.  □
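The device rests on an elementary sandwich argument, sketched here for a single index $n$ with subscripts dropped. The abbreviation $\Delta = \sup_s |P(\eta \le s) - \Phi(s)|$ is introduced only for this illustration.

```latex
\begin{align*}
P(\xi \le t) &\le P(\eta \le t + a) + P(|\xi - \eta| \ge a)
              \le \Phi(t + a) + \Delta + P(|\xi - \eta| \ge a), \\
P(\xi \le t) &\ge P(\eta \le t - a) - P(|\xi - \eta| \ge a)
              \ge \Phi(t - a) - \Delta - P(|\xi - \eta| \ge a), \\
|\Phi(t \pm a) - \Phi(t)| &\le a \sup_s \Phi'(s) = a/\sqrt{2\pi}.
\end{align*}
```

Combining the three lines gives $|P(\xi \le t) - \Phi(t)| \le \Delta + a/\sqrt{2\pi} + P(|\xi - \eta| \ge a)$ uniformly in $t$, and the middle term is the $O(a_n)$ contribution in the displayed inequality.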


As will be seen below, it suffices for (3.8a), and thus for (3.2)
also, that $f$ have a bounded second derivative. (Of course, it is
understood that $f_n$ must be suitably chosen, also.) If, further, $f''$ is
a continuous $L_2(-\infty, \infty)$ function and $f_n$ is of suitable kernel
type, then (B1) holds and (C) holds with $a_n = O(n^{-\beta+\varepsilon})$
for a suitable $\beta > 0$ and any $\varepsilon > 0$. For details on
the latter, see Cheng and Serfling (1979).


We now give conditions on $f$ and $f_n$ sufficient for the properties

(3.11)  $\mu_n = o(n^{-1/2})$,

(3.12)  $\sigma_n^2 \to \sigma^2$, $0 < \sigma^2 < \infty$,

(3.13)  $\gamma_n(3)/(n^{1/2}\sigma_n^3) = O(n^{-1/2})$.

We confine attention to $f_n$ of kernel type.

LEMMA 3.1. Let $f$ have bounded second derivative. Assume that both
$\int \phi(x)\psi(F(x))f(x)\,dx$ and $\int \phi(x)\psi'(F(x))f^2(x)\,dx$ are
finite. Let $K$ satisfy $\int zK(z)\,dz = 0$ and $\int z^2|K(z)|\,dz < \infty$,
and suppose $c_n = o(n^{-1/4})$. Then (3.11) holds.

LEMMA 3.2. Let $f$ be bounded and continuous, let $\phi$ be bounded and
continuous, and let $\psi$ have bounded derivative. Then $\sigma_n^2$ has
finite positive limit and $\gamma_n(3)$ is bounded.

The proofs are routine and may be found, with related results, in
Cheng (1979).

4.  Some direct extensions. Here we indicate extensions of Section 3
in two directions. For the first case we assume

    $f$ has bounded support, say in $[a, b]$;

    $\phi$ is continuous;

    $\psi$ has bounded second derivative.

We also assume that the empirical density function $f_n$ has support in
$[a, b]$ for large $n$, which can be arranged by taking $f_n$ to be of
kernel type with kernel function having bounded support. Under these
assumptions, Theorems 1 and 2 of Section 3 carry over unchanged and by
means of similar proofs.


Next we assume, alternatively, that

    $f$ is square integrable;

    $\phi$ is bounded;

    $\psi$ has bounded second derivative.

In this case both terms in (2.3) are relevant, so that condition (B2)
comes into action. With appropriate modifications in this respect, again
the assertions of Section 3 carry over.

5.  The general case. In this section we simultaneously remove the
conditions on $\phi$ and drop the restrictions on the support of $f$. We
assume only that $f$ is square integrable, and we retain the assumption
that $\psi$ has bounded second derivative. Instead of the estimator
$T(f_n)$, we employ the truncated version defined in Section 2, and we
introduce the function

    $H(t_n) = \sup_{|x| \le t_n} |\phi(x)|$.
If there exists a choice of $t_n$ such that

(5.2)  $\int_{|x|>t_n} \phi(x)\psi(F(x))f^2(x)\,dx = O(n^{-1/2}(\log n)^{1/2}H(t_n))$,

then $T_n(f)$ may be replaced by $T(f)$ in (5.1).

Similarly, by replacing $\mu_n$ and $\sigma_n$ by the corresponding
quantities for the truncated functional $T_n$ and adding condition
(B2) in Theorem 2, the result carries over with the assertion:

(5.3)  $n^{1/2}[T_n(f_n) - T_n(f)] \to_d N(0, \sigma^2)$.

If the left-hand side of (5.2) is $o(n^{-1/2})$, then $T_n(f)$ may be
replaced by $T(f)$ in (5.3).

6.  Examples and computations. In this section we confine attention
to the case $T(f) = \int f^2(x)\,dx$ and consider $f_n$ to be of kernel type.
Two choices of kernel $K$ will be considered.

EXAMPLE 1. The uniform density as kernel function. Define $K(x) = 1/2$
if $|x| \le 1$, and $K(x) = 0$ otherwise. Then, following an argument of
Bhattacharyya and Roussas (1969), $T(f_n)$ may be expressed as a linear
combination of order statistics,

    $T(f_n) = (2nc_n)^{-1} + (2n^2c_n^2)^{-1}\,\Sigma^{*}\,(2c_n - |X_i - X_j|)$,

where $\Sigma^{*}$ denotes summation over all $1 \le i < j \le n$ such that
$|X_i - X_j| \le 2c_n$. If $f$ has a bounded second derivative which is a
continuous $L_2(-\infty, \infty)$ function, and if $c_n = An^{-1/5}$, then by
results of Cheng and Serfling (1979) the conditions of Theorems 1 and 2
hold and we have $T(f_n) - T(f) = O(n^{-1/2}(\log n)^{1/2})$ a.s. as well as
$n^{1/2}[T(f_n) - T(f)] \to_d N(0, 4\int f(x)[f(x) - T(f)]^2\,dx)$.  □
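The pairwise-difference form of $T(f_n)$ for the uniform kernel can be checked against direct numerical integration of $f_n^2$. In the sketch below the sample, the bandwidth value `c`, and the function names are illustrative assumptions; the closed form coded is $(2nc)^{-1} + (2n^2c^2)^{-1}\sum(2c - |X_i - X_j|)$ over pairs $i < j$ with $|X_i - X_j| \le 2c$.

```python
import numpy as np

def t_uniform_closed_form(sample, c):
    """Closed form of int f_n^2 for K = 1/2 on [-1, 1], using pairs i < j only."""
    n = len(sample)
    d = np.abs(sample[:, None] - sample[None, :])
    i, j = np.triu_indices(n, k=1)
    gaps = d[i, j]
    pair_sum = np.sum(2.0 * c - gaps[gaps <= 2.0 * c])
    return float(1.0 / (2.0 * n * c) + pair_sum / (2.0 * n ** 2 * c ** 2))

def t_uniform_numeric(sample, c, m=200001):
    """Riemann-sum approximation of int f_n(x)^2 dx on a fine grid."""
    grid = np.linspace(sample.min() - 2 * c, sample.max() + 2 * c, m)
    f_n = np.zeros_like(grid)
    for x_i in sample:
        # f_ni(x) = c^{-1} K((x - X_i) / c) = 1 / (2c) on |x - X_i| <= c.
        f_n += (np.abs(grid - x_i) <= c) * (0.5 / c)
    f_n /= len(sample)
    return float(np.sum(f_n ** 2) * (grid[1] - grid[0]))

rng = np.random.default_rng(2)
sample = rng.standard_normal(200)
c = 0.35                                   # illustrative bandwidth (assumption)
closed = t_uniform_closed_form(sample, c)
numeric = t_uniform_numeric(sample, c)
```

Sorting the sample first would let the pair sum be restricted to neighbors within $2c$, which is one way to read the "order statistics" remark; the dense pairwise matrix is used here only for brevity.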
EXAMPLE 2. The triangular function as a kernel function. Define
$K(x) = 1 - x$ if $0 \le x \le 1$, $= 1 + x$ if $-1 \le x \le 0$, $= 0$
otherwise. It can be shown that $T(f_n)$ may be represented as a polynomial
function of the differences $|X_i - X_j|$. Also, the same assertions of
a.s. convergence and asymptotic normality as in the preceding example
apply.  □
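One way to see the polynomial representation: for any symmetric kernel, $\int f_n^2 = (n^2c_n)^{-1}\sum_i\sum_j (K * K)((X_i - X_j)/c_n)$, and for the triangular kernel the self-convolution $K * K$ is the piecewise cubic coded below. This is a sketch under assumptions (sample, bandwidth `c`, and function names are illustrative), checked against numerical integration.

```python
import numpy as np

def triangle_autoconvolution(u):
    """(K * K)(u) for K(x) = max(1 - |x|, 0): piecewise cubic in |u| on [-2, 2]."""
    a = np.abs(u)
    inner = (4.0 - 6.0 * a ** 2 + 3.0 * a ** 3) / 6.0   # |u| <= 1
    outer = (2.0 - a) ** 3 / 6.0                        # 1 < |u| <= 2
    return np.where(a <= 1.0, inner, np.where(a <= 2.0, outer, 0.0))

def t_triangular_closed_form(sample, c):
    """int f_n^2 = (n^2 c)^{-1} sum_i sum_j (K * K)((X_i - X_j) / c)."""
    d = (sample[:, None] - sample[None, :]) / c
    return float(np.sum(triangle_autoconvolution(d)) / (len(sample) ** 2 * c))

def t_triangular_numeric(sample, c, m=20001):
    """Trapezoidal approximation of int f_n(x)^2 dx on a fine grid."""
    grid = np.linspace(sample.min() - 2 * c, sample.max() + 2 * c, m)
    u = (grid[None, :] - sample[:, None]) / c
    f_n = np.mean(np.maximum(1.0 - np.abs(u), 0.0), axis=0) / c
    y = f_n ** 2
    return float(np.sum((y[1:] + y[:-1]) * np.diff(grid)) / 2.0)

rng = np.random.default_rng(3)
sample = rng.standard_normal(150)
c = 0.4                                    # illustrative bandwidth (assumption)
closed = t_triangular_closed_form(sample, c)
numeric = t_triangular_numeric(sample, c)
```

Each term of the double sum is a cubic polynomial in $|X_i - X_j|/c$ on the relevant range, which matches the representation asserted in the example.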